@zadzanl zadzanl commented Nov 1, 2025

Description

Adds instruction-aware query prefixes for DeepInfra embedding models to
improve semantic search accuracy. Changes include:

  • Added queryPrefix support for Qwen3-Embedding models (0.6B, 4B, 8B) with code
    search instruction format
  • Added queryPrefix for intfloat/multilingual-e5-large-instruct model
  • Added queryPrefix for google/embeddinggemma-300m with task-specific format
  • Added queryPrefix for BAAI/bge-large-en-v1.5 with passage retrieval format
  • Reduced MAX_ITEM_TOKENS from 8191 to 512 for compatibility with models
    that have 512 token limits
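The queryPrefix mechanism described above can be sketched as a per-model profile lookup. This is an illustrative sketch only: the profile shape, the `deepInfraModels` table, and the exact prefix strings are assumptions modeled on each vendor's documented instruction formats, not the PR's actual `embeddingModels.ts` contents.

```typescript
// Hypothetical sketch of the queryPrefix lookup; the real model profiles
// live in embeddingModels.ts and may differ in shape and wording.
interface EmbeddingModelProfile {
	dimension: number
	queryPrefix?: string
}

const deepInfraModels: Record<string, EmbeddingModelProfile> = {
	"Qwen/Qwen3-Embedding-0.6B": {
		dimension: 1024,
		// Qwen3 embedding models use an instruction-style query format.
		queryPrefix: "Instruct: Given a code search query, retrieve relevant code snippets\nQuery: ",
	},
	"intfloat/multilingual-e5-large-instruct": {
		dimension: 1024,
		queryPrefix: "Instruct: Given a search query, retrieve relevant passages\nQuery: ",
	},
	"BAAI/bge-large-en-v1.5": {
		dimension: 1024,
		// bge-large-en-v1.5 documents a fixed retrieval prefix for queries.
		queryPrefix: "Represent this sentence for searching relevant passages: ",
	},
}

function getModelQueryPrefix(modelId: string): string | undefined {
	return deepInfraModels[modelId]?.queryPrefix
}
```

Passages being indexed are embedded without a prefix; only search queries get one, which is why the lookup returns `undefined` for models (or call sites) that do not need it.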

Test Procedure

  • Confirmed MAX_ITEM_TOKENS reduction is applied in index.ts
  • Tested getModelQueryPrefix() function returns correct prefixes for each model
  • Validated existing functionality remains working for all models

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue.
  • Scope: My changes are focused on the linked issue.
  • Self-Review: I have performed a thorough self-review of my code.
  • Documentation Impact: I have considered if my changes require
    documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Documentation Updates

  • No documentation updates are required.
  • Yes, documentation updates are required.

Additional Notes

The query prefixes are model-specific and follow the recommended format
from each model's documentation. This change is backward compatible
and won't affect existing implementations that don't use these specific models.

Get in Touch

Discord: badgambit


Important

Adds instruction-aware query prefixes for embedding models and adjusts token limits for compatibility.

  • Behavior:
    • Adds instruction-aware query prefixes for Qwen3-Embedding models (0.6B, 4B, 8B), intfloat/multilingual-e5-large-instruct, google/embeddinggemma-300m, and BAAI/bge-large-en-v1.5 in embeddingModels.ts.
    • Reduces MAX_ITEM_TOKENS from 8191 to 512 in index.ts for model compatibility.
  • Tests:
    • Adds tests in openai-compatible.spec.ts for DeepInfra provider detection and handling, including encoding format and response processing.
    • Validates getModelQueryPrefix() function returns correct prefixes for each model.
  • Misc:
    • Updates OpenAICompatibleEmbedder in openai-compatible.ts to handle different encoding formats based on provider type.

This description was created by Ellipsis for e9f2e0c.

CommitGambit and others added 8 commits October 31, 2025 19:38
Update branch to latest - 31 Oct 2025
…ld index

Added support for DeepInfra-hosted embedding models and fixed a critical bug where
the 'type' field index was missing in Qdrant, causing "Bad Request" errors
during code search operations.

Changes:
- Added DeepInfra provider detection in OpenAICompatibleEmbedder
  * Detect DeepInfra URLs (deepinfra.com)
  * Use 'float' encoding format for DeepInfra, 'base64' for other standard
    providers
  * Handle both float array and base64 string embedding responses
  * Added validation for embedding values (NaN/Infinity checking)
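The provider-detection and encoding-format bullets above can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names (`isDeepInfraUrl`, `encodingFormatFor`, `decodeBase64Embedding`) are illustrative, not the actual `OpenAICompatibleEmbedder` API, and the base64 decoding assumes the provider packs embeddings as little-endian float32, as OpenAI-compatible APIs conventionally do.

```typescript
// Hypothetical sketch of provider-aware encoding-format selection;
// the real logic lives in openai-compatible.ts and may differ.
function isDeepInfraUrl(baseUrl: string): boolean {
	try {
		return new URL(baseUrl).hostname.endsWith("deepinfra.com")
	} catch {
		return false // not a parseable URL, so not DeepInfra
	}
}

function encodingFormatFor(baseUrl: string): "float" | "base64" {
	// DeepInfra returns plain float arrays; other OpenAI-compatible
	// providers are asked for base64 to shrink the response payload.
	return isDeepInfraUrl(baseUrl) ? "float" : "base64"
}

// Decoding a base64-packed embedding back into numbers (Node.js Buffer):
function decodeBase64Embedding(b64: string): number[] {
	const buf = Buffer.from(b64, "base64")
	const floats = new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4)
	return Array.from(floats)
}
```

Handling both shapes at the response-processing layer lets the rest of the indexing pipeline stay agnostic to which provider produced the vectors.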

- Fixed missing Qdrant payload index for 'type' field
  * Non-existing `type` field causes "Bad Request" during `codebase_search`
    tool invocation
  * Create keyword index for 'type' field to support metadata filtering
  * Resolves "Index required but not found for 'type' field" error

- Added 7 DeepInfra embedding model profiles:
  * Qwen/Qwen3-Embedding-0.6B (1024 dims)
  * Qwen/Qwen3-Embedding-4B (2560 dims)
  * Qwen/Qwen3-Embedding-8B (4096 dims)
  * intfloat/multilingual-e5-large-instruct (1024 dims)
  * google/embeddinggemma-300m (768 dims)
  * BAAI/bge-m3 (1024 dims)
  * BAAI/bge-large-en-v1.5 (1024 dims)

- Added test coverage for DeepInfra
  * Provider validation
  * Encoding format tests
  * Float array and base64 response handling tests
  * Configuration validation tests

Tested with: embeddinggemma-300m, text-embedding-004, multilingual-e5-large
feat: add DeepInfra embedding support and fix missing Qdrant `type` index
…dels

- Add queryPrefix support for Qwen3-Embedding models (0.6B, 4B, 8B)
- Add queryPrefix for intfloat/multilingual-e5-large-instruct
- Add queryPrefix for google/embeddinggemma-300m
- Add queryPrefix for BAAI/bge-large-en-v1.5
- Reduce MAX_ITEM_TOKENS from 8191 to 512 for compatibility with models that have 512 token limits (e5-large, bge-large-en-v1.5)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@zadzanl zadzanl requested review from cte, jr and mrubens as code owners November 1, 2025 19:41
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 1, 2025
@zadzanl zadzanl closed this Nov 1, 2025
@roomote

roomote bot commented Nov 1, 2025

See this task on Roo Code Cloud

Found 1 issue that needs to be addressed:

  • Add test coverage for query prefix functionality in openai-compatible.spec.ts

Mention @roomote in a comment to trigger your PR Fixer agent and make changes to this pull request.

@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Nov 1, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Nov 1, 2025
@dosubot dosubot bot added the enhancement New feature or request label Nov 1, 2025

The query prefix functionality added in this PR (lines 114-137) lacks test coverage. While DeepInfra provider detection and encoding format are tested, there are no tests verifying that getModelQueryPrefix() prefixes are actually being applied to queries. Consider adding tests that verify: (1) prefixes are correctly added for models that require them, (2) double-prefixing is prevented, and (3) texts that would exceed MAX_ITEM_TOKENS after prefixing are handled appropriately.
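The double-prefixing concern the reviewer raises can be sketched as a small guard plus the assertions a test would make. The `applyQueryPrefix` helper here is hypothetical, shown only to illustrate the behavior under test; the PR's actual prefix application lives inside the embedder and may be structured differently.

```typescript
// Hypothetical helper illustrating the behavior the requested tests
// should verify; not the PR's actual API.
function applyQueryPrefix(query: string, prefix?: string): string {
	if (!prefix) return query
	// Guard against double-prefixing when a caller already prepended it.
	return query.startsWith(prefix) ? query : prefix + query
}

// The requested tests would assert, in vitest/jest style:
//   expect(applyQueryPrefix("find auth code", PREFIX)).toBe(PREFIX + "find auth code")
//   expect(applyQueryPrefix(PREFIX + "find auth code", PREFIX)).toBe(PREFIX + "find auth code")
//   expect(applyQueryPrefix("find auth code", undefined)).toBe("find auth code")
```

A third case worth covering, as the comment notes, is a text near MAX_ITEM_TOKENS that only exceeds the limit once the prefix is added; the test should pin down whether such inputs are truncated, rejected, or embedded unprefixed.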

@zadzanl zadzanl changed the title Feature/instruction aware embeddings sync Feature/instruction aware embeddings Nov 1, 2025

Labels

enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants